17 research outputs found

    Behavioral Privacy Risks and Mitigation Approaches in Sharing of Wearable Inertial Sensor Data

    Wrist-worn inertial sensors in activity trackers and smartwatches are increasingly being used for daily tracking of activity and sleep. Wearable devices, with their onboard sensors, provide an appealing mobile health (mHealth) platform that can be leveraged for continuous and unobtrusive monitoring of an individual in their daily life. As a result, the adoption of wrist-worn devices in many applications (such as health, sport, and recreation) is increasing. Additionally, an increasing number of sensory datasets consisting of motion sensor data from wrist-worn devices are becoming publicly available for research. However, releasing or sharing these wearable sensor data creates serious privacy concerns for the user. First, in many application domains (such as mHealth, insurance, and health providers), user identity is an integral part of the shared data. In such settings, instead of identity privacy preservation, the focus is more on the behavioral privacy problem, that is, the disclosure of sensitive behaviors from the shared sensor data. Second, different datasets usually focus on only a select subset of these behaviors. But if users can be re-identified from accelerometry data, different databases of motion data (contributed by the same user) can be linked, revealing sensitive behaviors or health diagnoses of a user that were neither originally declared by a data collector nor consented to by the user. The contributions of this dissertation are multifold. First, to show the behavioral privacy risk in sharing raw sensor data, this dissertation presents a detailed case study of detecting cigarette smoking in the field. It proposes a new machine learning model, called puffMarker, that achieves a false positive rate of 1/6 (or 0.17) per day, with a recall rate of 87.5%, when tested in a field study with 61 newly abstinent daily smokers.
Second, it proposes a model-based data substitution mechanism, namely mSieve, to protect behavioral privacy. It evaluates the efficacy of the scheme using 660 hours of collected raw sensor data and demonstrates that it is possible to retain meaningful utility, in terms of inference accuracy (90%), while simultaneously preserving the privacy of sensitive behaviors. Third, it analyzes the risks of user re-identification from wrist-worn sensor data, even after applying mSieve for protecting behavioral privacy. It presents a deep learning architecture that can identify unique micro-movement patterns in each wearer's wrist. A new consistency-distinction loss function is proposed to train the deep learning model for open set learning so as to maximize re-identification consistency for known users and amplify distinction from any unknown user. In 10 weeks of daily sensor wearing by 353 participants, we show that a known user can be re-identified with a 99.7% true matching rate while keeping the false acceptance rate to 0.1% for an unknown user. Finally, for mitigation, we show that injecting even a low level of Laplace noise in the data stream can limit the re-identification risk. This dissertation creates new research opportunities on understanding and mitigating risks and ethical challenges associated with behavioral privacy

    The Pakistan risk of myocardial infarction study: A resource for the study of genetic, lifestyle and other determinants of myocardial infarction in south Asia

    The burden of coronary heart disease (CHD) is increasing at a greater rate in South Asia than in any other region globally, but there is little direct evidence about its determinants. The Pakistan Risk of Myocardial Infarction Study (PROMIS) is an epidemiological resource to enable reliable study of genetic, lifestyle and other determinants of CHD in South Asia. By March 2009, PROMIS had recruited over 5,000 cases of first-ever confirmed acute myocardial infarction (MI) and over 5,000 matched controls aged 30-80 years. For each participant, information has been recorded on demographic factors, lifestyle, medical and family history, anthropometry, and a 12-lead electrocardiogram. A range of biological samples has been collected and stored, including DNA, plasma, serum and whole blood. During its next stage, the study aims to expand recruitment to achieve a total of about 20,000 cases and about 20,000 controls, and, in subsets of participants, to enrich the resource by collection of monocytes, establishment of lymphoblastoid cell lines, and by resurveying participants. Measurements in progress include profiling of candidate biochemical factors, assay of 45,000 variants in 2,100 candidate genes, and a genomewide association scan of over 650,000 genetic markers. We have established a large epidemiological resource for CHD in South Asia. In parallel with its further expansion and enrichment, the PROMIS resource will be systematically harvested to help identify and evaluate genetic and other determinants of MI in South Asia. Findings from this study should advance scientific understanding and inform regionally appropriate disease prevention and control strategies

    mTeeth: Identifying Brushing Teeth Surfaces Using Wrist-Worn Inertial Sensors

    Ensuring that all the teeth surfaces are adequately covered during daily brushing can reduce the risk of several oral diseases. In this paper, we propose the mTeeth model to detect teeth surfaces being brushed with a manual toothbrush in the natural free-living environment using wrist-worn inertial sensors. To unambiguously label sensor data corresponding to different surfaces and capture all transitions that last only milliseconds, we present a lightweight method to detect the micro-event of brushing strokes that cleanly demarcates transitions among brushing surfaces. Using features extracted from brushing strokes, we propose a Bayesian Ensemble method that leverages the natural hierarchy among teeth surfaces and patterns of transition among them. For training and testing, we enrich a publicly available wrist-worn inertial sensor dataset collected from the natural environment with time-synchronized precise labels of brushing surface timings and moments of transition. We annotate 10,230 instances of brushing on different surfaces from 114 episodes and evaluate the impact of wide between-person and within-person between-episode variability on the machine learning model's performance for brushing surface detection

    Hierarchical Span-Based Conditional Random Fields for Labeling and Segmenting Events in Wearable Sensor Data Streams

    The field of mobile health (mHealth) has the potential to yield new insights into health and behavior through the analysis of continuously recorded data from wearable health and activity sensors. In this paper, we present a hierarchical span-based conditional random field model for the key problem of jointly detecting discrete events in such sensor data streams and segmenting these events into high-level activity sessions. Our model includes higher-order cardinality factors and inter-event duration factors to capture domain-specific structure in the label space. We show that our model supports exact MAP inference in quadratic time via dynamic programming, which we leverage to perform learning in the structured support vector machine framework. We apply the model to the problems of smoking and eating detection using four real data sets. Our results show statistically significant improvements in segmentation performance relative to a hierarchical pairwise CRF

    WristPrint: Characterizing User Re-identification Risks from Wrist-worn Accelerometry Data

    Public release of wrist-worn motion sensor data is growing. These releases enable and accelerate research in developing new algorithms to passively track daily activities, resulting in improved health and wellness utilities of smartwatches and activity trackers. But when combined with sensitive attribute inference attacks and linkage attacks via re-identification of the same user in multiple datasets, undisclosed sensitive attributes can be revealed to unintended organizations, with potentially adverse consequences for unsuspecting data-contributing users. To guide both users and data-collecting researchers, we characterize the re-identification risks inherent in motion sensor data collected from wrist-worn devices in users' natural environment. For this purpose, we use an open-set formulation, train a deep learning architecture with a new loss function, and apply our model to a new data set consisting of 10 weeks of daily sensor wearing by 353 users. We find that re-identification risk increases with an increase in the activity intensity. On average, such risk is 96% for a user when sharing a full day of sensor data

    Automated Detection of Stressful Conversations Using Wearable Physiological and Inertial Sensors

    Stressful conversation is a frequently occurring stressor in our daily life. Stressors adversely affect not only our physical and mental health but also our relationships with family, friends, and coworkers. In this paper, we present a model to automatically detect stressful conversations using wearable physiological and inertial sensors. We conducted a lab and a field study with cohabiting couples to collect ecologically valid sensor data with temporally precise labels of stressors. We introduce the concept of stress cycles, i.e., the physiological arousal and recovery, within a stress event. We identify several novel features from stress cycles and show that they exhibit distinguishing patterns during stressful conversations when compared to the physiological response due to other stressors. We observe that hand gestures also show a distinct pattern when stress occurs due to stressful conversations. We train and test our model using field data collected from 38 participants. Our model can determine whether a detected stress event is due to a stressful conversation with an F1-score of 0.83, using features obtained from only one stress cycle, facilitating intervention delivery within 3.9 minutes of the start of a stressful conversation

    Association of phosphodiesterase 4D gene with ischemic stroke in a Pakistani population

    Background and Objectives: Identification of the STRK1 locus by the deCODE group, followed by the discovery of the phosphodiesterase 4D (PDE4D) gene in strong association with ischemic stroke, has provided useful insights toward understanding the genetic etiology of the disease. In this study, we aimed at investigating the association between 3 polymorphisms of the PDE4D gene and ischemic stroke in the Pakistani population. Methods: Three polymorphisms in the PDE4D gene were analyzed in 200 patients with ischemic stroke and 250 controls of Pakistani origin using the polymerase chain reaction-restriction fragment length polymorphism method. Data were coded and entered in SPSS Windows (version 12.0). Odds ratios and 95% CIs were calculated using multivariate logistic regression analysis. Results: Marker SNP83 (rs966221) was found significantly associated with ischemic stroke on univariate and multivariate analysis. Conclusion: The association of PDE4D variation with ischemic stroke extends to the Pakistani population and supports a role for phosphodiesterases in stroke pathogenesis

    Using novel mobile sensors to assess stress and smoking lapse

    Mobile sensors can now provide unobtrusive measurement of both stress and cigarette smoking behavior. We describe here the first field tests of two such methods, cStress and puffMarker, that were used to examine relationships between stress and smoking behavior and lapse in a sample of 76 smokers motivated to quit smoking. Participants wore a mobile sensor suite, called AutoSense, which collected continuous physiological data for 4 days (24 hours pre-quit and 72 hours post-quit) in the field. Algorithms were applied to the physiological data to create indices of stress (cStress) and first lapse smoking episodes (puffMarker). We used mixed effects interrupted autoregressive time series models to assess changes in heart rate (HR), cStress, and nicotine craving across the 4-day period. Self-report assessments using ecological momentary assessment (EMA) of mood, withdrawal symptoms, and smoking behavior were also used. Results indicated that HR and cStress each predicted smoking lapse. These results suggest that measures of traditional psychophysiology, such as HR, are not redundant with cStress; both provide important information. Results are consistent with existing literature and provide clear support for cStress and puffMarker in ambulatory clinical research. This research lays groundwork for sensor-based markers in developing and delivering sensor-triggered, just-in-time interventions that are sensitive to stress-related lapse risk factors